Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading
نویسندگان
چکیده
In this paper, we have developed a novel methodology that takes into consideration multithreaded many-core designs to better utilize memory/processing resources and improve memory residence on tileable applications. It takes advantage of polyhedral analysis and transformation in the form of PLUTO[6], combined with a highly optimized fine grain tile runtime to exploit parallelism at all levels. The main contributions of this paper include the introduction of multi-hierarchical tiling techniques that increases intra tile parallelism; and a data-flow inspired runtime library that allows the expression of parallel tiles with an efficient synchronization registry. Our current implementation shows performance improvements on an Intel Xeon Phi board up to 32.25% against instances produced by state-of-the-art compiler frameworks for selected stencil applications.
منابع مشابه
A fine-grain multithreading superscalar architecture
In this study we show that fine-grain multithreading is an effective way to increase instruction-level parallelism and hide the latencies of long-latency operations in a superscalar processor. The effects of long-latency operations, such as remote memory references, cachemisses, and multi-cycle floating-point calculations, are detrimental to performance since such operations typically cause a s...
متن کاملA Feasibility Study of Hierarchical Multithreading
Many studies have shown that significant amounts of parallelism exist at different granularities. Execution models such as superscalar and VLIW exploit parallelism from a single thread. Multithreaded processors make a step towards exploiting parallelism from different threads, but are not geared to exploit parallelism at different granularities (fine and medium grain). In this paper we present ...
متن کاملSimultaneous Multithreading: Maximizing On-Chip Parallelism - Computer Architecture, 1995. Proceedings., 22nd Annual International Symposium on
This paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar's multiple functional units in a single cycle. We present several models of simultaneous multithreading and compare them with altemative organizations: a wide superscalar, a fine-grain multithreaded processor, and single-chip, multiple-issue multiprocessing ...
متن کاملEfficient Fine-Grain Synchronization on a Multi-Core Chip Architecture: A Fresh Look
Multi-core chip architectures are becoming mainstream, permitting increasing on-chip parallelism through hardware support for multithreading. Fine-grain synchronization is essential to the effective utilization of the capacity provided by future high-performance multi-core architectures. However, there are also new challenges realizing such fine-grain synchronization in large-scale multi-core c...
متن کاملW.m. Zuberek: Performance of Fine-grain Multithreaded Multiprocessors Performance Analysis of Fine–grain Multithreaded Multiprocessors
Instruction–level multithreading is an architectural approach to tolerating long–latency memory accesses and synchronization delays in distributed–memory systems. The paper presents a timed Petri net model of a fine–grain multithreaded distributed–memory multiprocessor system at the instruction execution level, and illustrates performance analysis by results obtained from simulation of the deri...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014